
    Estimation and prediction of the vehicle's motion based on visual odometry and Kalman filter

    Proceedings of the 14th International Conference, ACIVS 2012, Brno, Czech Republic, September 4-7, 2012. The motion of a vehicle is useful information for applications such as driver assistance systems or autonomous vehicles. It can be obtained by several methods, for instance by GPS or by visual odometry; however, there are situations in which neither works correctly. In urban environments the GPS signal is unavailable in areas such as tunnels or streets flanked by tall buildings, while computer vision algorithms suffer in outdoor environments, where the main source of difficulty is variation in lighting conditions. This paper presents a method to estimate and predict the movement of the vehicle based on visual odometry and a Kalman filter. The Kalman filter provides both filtering and prediction of vehicle motion, using the results of the visual odometry estimation as measurements. This work was supported by the Spanish Government through the CICYT projects FEDORA (Grant TRA2010-20255-C03-01) and Driver Distraction Detector System (Grant TRA2011-29454-C03-02), and by the CAM through the SEGVAUTO-II project.
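    As a rough illustration of the filtering-and-prediction step described above, here is a minimal constant-velocity Kalman filter in Python. The state layout, time step and noise covariances are placeholder assumptions of ours, not values from the paper; the visual odometry is assumed to supply noisy (x, y) position measurements, and the predict step alone can bridge frames where the odometry drops out.

```python
import numpy as np

# Minimal constant-velocity Kalman filter: state = [x, y, vx, vy].
# dt, Q and R below are illustrative placeholders, not the paper's values.
dt = 0.1
F = np.array([[1, 0, dt, 0],
              [0, 1, 0, dt],
              [0, 0, 1, 0],
              [0, 0, 0, 1]], dtype=float)   # state transition
H = np.array([[1, 0, 0, 0],
              [0, 1, 0, 0]], dtype=float)   # we observe position only
Q = 0.01 * np.eye(4)                        # process noise
R = 0.5 * np.eye(2)                         # measurement noise

x = np.zeros(4)   # initial state
P = np.eye(4)     # initial covariance

def kf_step(x, P, z):
    """One predict/update cycle given an odometry measurement z = (x, y)."""
    # Predict: usable on its own when the visual odometry fails.
    x_pred = F @ x
    P_pred = F @ P @ F.T + Q
    # Update with the visual-odometry measurement.
    y = z - H @ x_pred                      # innovation
    S = H @ P_pred @ H.T + R
    K = P_pred @ H.T @ np.linalg.inv(S)     # Kalman gain
    x_new = x_pred + K @ y
    P_new = (np.eye(4) - K @ H) @ P_pred
    return x_new, P_new

x, P = kf_step(x, P, np.array([0.12, 0.05]))
```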

    Using strong shape priors for stereo

    This paper addresses the problem of obtaining an accurate 3D reconstruction from multiple views. Taking inspiration from the recent successes of using strong prior knowledge for image segmentation, we propose a framework for 3D reconstruction which uses such priors to overcome the ambiguity inherent in this problem. Our framework is based on an object-specific Markov Random Field (MRF) [10]. It uses a volumetric scene representation and integrates conventional reconstruction measures such as photo-consistency, surface smoothness and visual hull membership with a strong object-specific prior. Simple parametric models of objects will be used as strong priors in our framework. We will show how parameters of these models can be efficiently estimated by performing inference on the MRF using dynamic graph cuts [7]. This procedure not only gives an accurate object reconstruction, but also provides us with information regarding the pose or state of the object being reconstructed. We will show the results of our method in reconstructing deformable and articulated objects.
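    For readers unfamiliar with the graph-cut machinery, the sketch below builds a toy binary MRF over a voxel grid with the PyMaxflow library and solves it exactly with a single max-flow. The unary costs are random stand-ins for the photo-consistency, visual-hull and shape-prior terms, and a constant Potts weight stands in for the smoothness term; the paper's dynamic graph cuts additionally reuse flow between successive cuts while searching over the parametric model.

```python
import numpy as np
import maxflow  # PyMaxflow: pip install PyMaxflow

# Toy binary MRF over a small voxel grid: label = "inside object" or not.
# The unary arrays are random stand-ins for photo-consistency + prior costs.
shape = (16, 16, 16)
rng = np.random.default_rng(0)
unary_in = rng.random(shape)    # cost of labeling a voxel one way
unary_out = rng.random(shape)   # cost of the opposite label
smoothness = 0.3                # Potts penalty between neighboring voxels

g = maxflow.Graph[float]()
nodes = g.add_grid_nodes(shape)
g.add_grid_edges(nodes, smoothness)            # pairwise smoothness terms
g.add_grid_tedges(nodes, unary_in, unary_out)  # data (unary) terms
g.maxflow()                                    # exact MAP via min-cut
occupancy = g.get_grid_segments(nodes)         # boolean volume: the labeling
```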

    Deep Eyes: Binocular Depth-from-Focus on Focal Stack Pairs

    The human visual system relies on both binocular stereo cues and monocular focusness cues to gain effective 3D perception. In computer vision, the two problems are traditionally solved in separate tracks. In this paper, we present a unified learning-based technique that simultaneously uses both types of cues for depth inference. Specifically, we use a pair of focal stacks as input to emulate human perception. We first construct a comprehensive focal stack training dataset synthesized by depth-guided light field rendering. We then construct three individual networks: a Focus-Net to extract depth from a single focal stack, an EDoF-Net to obtain the extended depth of field (EDoF) image from the focal stack, and a Stereo-Net to conduct stereo matching. We show how to integrate them into a unified BDfF-Net to obtain high-quality depth maps. Comprehensive experiments show that our approach outperforms the state of the art in both accuracy and speed and effectively emulates the human vision system.
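    The network composition can be made concrete with a schematic PyTorch sketch. The layer choices below are placeholders of our own; only the data flow follows the description: each focal stack passes through a shared Focus-Net and EDoF-Net, the two EDoF images feed a Stereo-Net, and a small fusion head merges the focus and stereo cues into the final depth map.

```python
import torch
import torch.nn as nn

def conv_block(cin, cout):
    return nn.Sequential(nn.Conv2d(cin, cout, 3, padding=1), nn.ReLU())

class BDfFNet(nn.Module):
    """Schematic composition only; real sub-networks are much deeper."""
    def __init__(self, stack_size=8):
        super().__init__()
        self.focus_net = nn.Sequential(conv_block(stack_size, 32),
                                       nn.Conv2d(32, 1, 3, padding=1))
        self.edof_net = nn.Sequential(conv_block(stack_size, 32),
                                      nn.Conv2d(32, 1, 3, padding=1))
        self.stereo_net = nn.Sequential(conv_block(2, 32),
                                        nn.Conv2d(32, 1, 3, padding=1))
        self.fusion = nn.Conv2d(3, 1, 3, padding=1)  # merge all depth cues

    def forward(self, left_stack, right_stack):
        d_left = self.focus_net(left_stack)    # depth-from-focus, left stack
        d_right = self.focus_net(right_stack)  # shared weights, right stack
        edof_l = self.edof_net(left_stack)     # all-in-focus images
        edof_r = self.edof_net(right_stack)
        disp = self.stereo_net(torch.cat([edof_l, edof_r], dim=1))
        return self.fusion(torch.cat([d_left, d_right, disp], dim=1))

net = BDfFNet()
depth = net(torch.randn(1, 8, 64, 64), torch.randn(1, 8, 64, 64))
```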

    Semantically Guided Depth Upsampling

    We present a novel method for accurate and efficient upsampling of sparse depth data, guided by high-resolution imagery. Our approach goes beyond the use of intensity cues only and additionally exploits object boundary cues through structured edge detection and semantic scene labeling for guidance. Both cues are combined within a geodesic distance measure that allows for boundary-preserving depth interpolation while utilizing local context. We model the observed scene structure by locally planar elements and formulate the upsampling task as a global energy minimization problem. Our method determines globally consistent solutions and preserves fine details and sharp depth boundaries. In our experiments on several public datasets at different levels of application, we demonstrate superior performance of our approach over the state of the art, even for very sparse measurements. Comment: German Conference on Pattern Recognition 2016 (Oral).
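    A toy version of the geodesic idea, under our own simplifying assumptions: instead of the paper's global energy with locally planar elements and semantic cues, each pixel simply inherits the depth of its geodesically nearest sparse sample, where the path cost grows with intensity differences so that depth does not bleed across strong edges.

```python
import heapq
import numpy as np

def geodesic_upsample(intensity, sparse_depth, edge_weight=10.0):
    """Nearest-seed depth assignment under a geodesic (edge-aware) metric."""
    h, w = intensity.shape
    dist = np.full((h, w), np.inf)
    depth = np.zeros((h, w))
    heap = []
    ys, xs = np.nonzero(~np.isnan(sparse_depth))
    for y, x in zip(ys, xs):                      # seed the sparse samples
        dist[y, x] = 0.0
        depth[y, x] = sparse_depth[y, x]
        heapq.heappush(heap, (0.0, y, x))
    while heap:                                   # Dijkstra over the 4-grid
        d, y, x = heapq.heappop(heap)
        if d > dist[y, x]:
            continue
        for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            ny, nx = y + dy, x + dx
            if 0 <= ny < h and 0 <= nx < w:
                # step cost grows with the intensity difference (edge cue)
                step = 1.0 + edge_weight * abs(intensity[ny, nx] - intensity[y, x])
                if d + step < dist[ny, nx]:
                    dist[ny, nx] = d + step
                    depth[ny, nx] = depth[y, x]   # propagate the seed's depth
                    heapq.heappush(heap, (d + step, ny, nx))
    return depth

img = np.random.rand(32, 32)
sparse = np.full((32, 32), np.nan)
sparse[::8, ::8] = np.random.rand(4, 4)           # 2% of pixels carry depth
dense = geodesic_upsample(img, sparse)
```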

    Stereo Computation for a Single Mixture Image

    This paper proposes an original problem of stereo computation from a single mixture image, a challenging problem that has not been studied before. The goal is to separate (i.e., unmix) a single mixture image into two constituent image layers, such that the two layers form a left-right stereo image pair from which a valid disparity map can be recovered. This is a severely ill-posed problem: from one input image, one effectively aims to recover three (i.e., the left image, the right image, and a disparity map). In this work we give a novel deep-learning based solution by jointly solving the two subtasks of image layer separation and stereo matching. Training our deep net is simple, as it does not require ground-truth disparity maps. Extensive experiments demonstrate the efficacy of our method. Comment: Accepted by the European Conference on Computer Vision (ECCV) 2018.
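    One plausible way to couple the two subtasks without ground-truth disparity, sketched in PyTorch; these losses are our own guess at the general idea, not the paper's exact formulation: the separated layers must re-mix into the input, and the right layer warped by the predicted disparity must match the left layer.

```python
import torch
import torch.nn.functional as F

def joint_loss(mixture, left, right, disparity):
    """Self-supervised coupling: remix consistency + disparity warp consistency.
    All tensors are (B, C, H, W) except disparity, which is (B, 1, H, W) in pixels.
    The additive 50/50 mixing model is an assumption of this sketch."""
    b, _, h, w = mixture.shape
    remix = 0.5 * (left + right)                  # layers must explain the input
    xs = torch.linspace(-1, 1, w).view(1, 1, w, 1).expand(b, h, w, 1)
    ys = torch.linspace(-1, 1, h).view(1, h, 1, 1).expand(b, h, w, 1)
    # Shift the sampling grid by the (normalized) disparity and warp right->left.
    grid = torch.cat([xs - 2 * disparity.permute(0, 2, 3, 1) / w, ys], dim=-1)
    right_warped = F.grid_sample(right, grid, align_corners=True)
    return F.l1_loss(remix, mixture) + F.l1_loss(right_warped, left)

x = torch.rand(2, 3, 32, 32)
loss = joint_loss(x, 0.6 * x, 0.4 * x, torch.zeros(2, 1, 32, 32))
```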

    Tuning of Adaptive Weight Depth Map Generation Algorithms Exploratory Data Analysis and Design of Computer Experiments (DOCE)

    In depth map generation algorithms, the parameter settings that yield an accurate disparity map estimation are usually chosen empirically or based on unplanned experiments. Algorithm performance is measured by the distance between the algorithm's results and the ground truth, following the Middlebury benchmark. This work presents a systematic statistical approach, including exploratory data analysis of over 14000 images and designs of experiments using 31 depth maps, to measure the relative influence of the parameters and to fine-tune them based on the number of bad pixels. The implemented methodology improves the performance of adaptive-weight-based dense depth map algorithms. As a result, the algorithm improves from 16.78% to 14.48% bad pixels using a classical exploratory data analysis of over 14000 existing images, while designs of computer experiments with 31 runs yielded an even better performance, lowering bad pixels from 16.78% to 13
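    The tuning loop itself is simple to sketch. Below, a small full-factorial design is evaluated and ranked by bad-pixel percentage; gamma_c, gamma_p and the window size are the usual adaptive-support-weight parameters, and compute_disparity is a hypothetical placeholder for the matcher under test, with random arrays standing in for the image pair and ground truth.

```python
import itertools
import numpy as np

def bad_pixel_pct(disp, gt, thresh=1.0):
    """Percentage of valid pixels whose disparity error exceeds thresh."""
    valid = gt > 0
    return 100.0 * np.mean(np.abs(disp[valid] - gt[valid]) > thresh)

def compute_disparity(left, right, gamma_c, gamma_p, window):
    # Hypothetical placeholder for the adaptive-weight matcher under test.
    return np.zeros_like(left)

rng = np.random.default_rng(1)
left_img, right_img = rng.random((2, 64, 64))   # stand-in stereo pair
ground_truth = rng.random((64, 64)) * 32        # stand-in ground truth

design = itertools.product([5.0, 10.0, 20.0],   # gamma_c (color similarity)
                           [10.0, 17.5, 25.0],  # gamma_p (spatial proximity)
                           [21, 35])            # support-window size
results = []
for gamma_c, gamma_p, window in design:
    disp = compute_disparity(left_img, right_img, gamma_c, gamma_p, window)
    results.append((bad_pixel_pct(disp, ground_truth), gamma_c, gamma_p, window))
print(min(results))  # best run: lowest bad-pixel percentage
```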

    Luminance, colour, viewpoint and border enhanced disparity energy model

    The visual cortex is able to extract disparity information through the use of binocular cells. This process is reflected by the Disparity Energy Model, which describes the role and functioning of simple and complex binocular neuron populations and how they are able to extract disparity. This model uses explicit cell parameters to mathematically determine preferred cell disparities, such as spatial frequencies, orientations, binocular phases and receptive field positions. However, the brain cannot access such explicit cell parameters; it must rely on cell responses. In this article, we implemented a trained binocular neuronal population which encodes disparity information implicitly. This allows the population to learn how to decode disparities, in a similar way to how our visual system could have developed this ability during evolution. At the same time, responses of monocular simple and complex cells can also encode line and edge information, which is useful for refining disparities at object borders. The brain should then be able, starting from a low-level disparity draft, to integrate all information, including colour and viewpoint perspective, in order to propagate better estimates to higher cortical areas. This work was supported by the Portuguese Foundation for Science and Technology (FCT); LARSyS FCT [UID/EEA/50009/2013]; EU project NeuroDynamics [FP7-ICT-2009-6, PN: 270247]; FCT project SparseCoding [EXPL/EEI-SII/1982/2013]; and FCT PhD grant [SFRH-BD-44941-2008].
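    For concreteness, here is a minimal one-dimensional disparity energy model in the classical phase-shift form; the Gabor parameters are illustrative, not taken from the article. A quadrature pair of binocular simple cells with interocular phase shift dphi is squared and summed into a complex-cell energy, which peaks when the stimulus disparity matches the cell's preferred disparity dphi/(2*pi*freq).

```python
import numpy as np

def gabor(x, freq, phase, sigma=2.0):
    """1-D Gabor receptive field: Gaussian envelope times a cosine carrier."""
    return np.exp(-x**2 / (2 * sigma**2)) * np.cos(2 * np.pi * freq * x + phase)

x = np.linspace(-8, 8, 257)
freq, dphi = 0.25, np.pi / 2   # preferred disparity = dphi/(2*pi*freq) = 1.0

def energy(left_signal, right_signal):
    """Binocular energy: quadrature pair of simple cells, squared and summed."""
    e = 0.0
    for phase in (0.0, np.pi / 2):          # quadrature pair
        s = (left_signal @ gabor(x, freq, phase)
             + right_signal @ gabor(x, freq, phase + dphi))
        e += s**2                           # complex-cell nonlinearity
    return e

def bar(center):
    """A narrow bar stimulus centered at `center`."""
    return np.exp(-(x - center)**2 / 0.5)

# Response as the bar's interocular shift (disparity) is swept:
responses = {d: energy(bar(0.0), bar(d)) for d in np.linspace(-3, 3, 7)}
```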

    Generalized Multi-Camera Scene Reconstruction Using Graph Cuts

    Reconstructing a 3-D scene from more than one camera is a classical problem in computer vision. One of the major sources of difficulty is the fact that not all scene elements are visible from all cameras. In the last few years, two promising approaches have been developed [. . .] that formulate the scene reconstruction problem in terms of energy minimization and minimize the energy using graph cuts. These energy minimization approaches treat the input images symmetrically, handle visibility constraints correctly, and allow spatial smoothness to be enforced. However, these algorithms propose different problem formulations and handle a limited class of smoothness terms. One algorithm [. . .] uses a problem formulation that is restricted to two-camera stereo and imposes smoothness between a pair of cameras. The other algorithm [. . .] can handle an arbitrary number of cameras, but imposes smoothness only with respect to a single camera. In this paper we give a more general energy minimization formulation for the problem, which allows a larger class of spatial smoothness constraints. We show that our formulation includes both of the previous approaches as special cases, as well as permitting new energy functions. Experimental results on real data with ground truth are also included.
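    Schematically, energy minimization formulations of this kind combine three terms; the notation below is ours, summarizing the abstract rather than reproducing the paper's exact definitions:

```latex
E(f) = \sum_{p \in \mathcal{P}} D_p(f_p)
     + \sum_{(p,q) \in \mathcal{N}} V_{p,q}(f_p, f_q)
     + E_{\mathrm{vis}}(f)
```

The first term scores photo-consistency of each pixel-label assignment, the second enforces spatial smoothness over a neighborhood system (the generalization claimed here being a larger admissible class of V_{p,q}), and the third assigns infinite cost to configurations that violate visibility.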